Abstract

Modular decomposition is fundamental for many important problems in algorithmic graph 
theory including transitive orientation, the recognition of several classes of graphs, and certain 
combinatorial optimization problems. Accordingly, there has been a drive towards a practical, 
linear-time algorithm for the problem. Despite considerable effort, such an algorithm has
remained elusive. The linear-time algorithms to date are impractical and of mainly theoretical
interest. In this paper we present the first simple, linear-time algorithm to compute the modular 
decomposition tree of an undirected graph. 

1 Introduction 

A natural operation to perform on a graph G is to take one of its vertices, say v, and replace it with 
another graph G', making v's neighbours universal to the vertices of G'. Modular decomposition
is interested in the inverse operation: finding a set of vertices sharing the same neighbours outside 
the set - that is, finding a module - and contracting this module into a single vertex. A graph's 
modules form a partitive family [2], and as such, define a decomposition scheme for the graph with 
an associated decomposition tree composed of the graph's strong modules - those that don't overlap 
other modules. To compute this modular decomposition tree is to compute the modular decomposition 
(and vice versa); and with its succinct representation of a graph's structure, its computation is often
a first step in many algorithms. Indeed, since Gallai first noticed its importance to comparability
graphs [11], modular decomposition has been established as a fundamental tool in algorithmic graph
theory. All efficient transitive orientation algorithms make essential use of modular decomposition
(e.g., [IZ]). It is frequently employed in recognizing different families of graphs, including interval
graphs [H], permutation graphs [23], and cographs [3]. Furthermore, restricted versions of many
combinatorial optimization problems can be efficiently solved using modular decomposition (e.g.,
[8]). While the papers [18, 19, 20] provide older surveys of its numerous applications, new uses
continue to be found, such as in the areas of graph drawing [23] and bioinformatics [10].

Not surprisingly, the problem of computing the modular decomposition has received considerable 
attention. Much like planarity testing and interval graph recognition, the importance of the problem 
has bent efforts toward a simple and efficient solution. The first polynomial-time algorithm appeared 
in the early 1970's and ran in time $O(n^4)$ [5]. Incremental improvements were made over the years
- [13, 21], for example - culminating in 1994 with the first linear-time algorithms, developed
independently by McConnell and Spinrad [16], and Cournier and Habib [4]. These are unfortunately so
complex as to be viewed primarily as theoretical contributions, with Spinrad himself hoping they
would be supplanted by something simpler (pg. 149, [25]). Subsequent algorithms, though, have
fallen short, either failing to achieve linear time or appealing to sophisticated data-structure
techniques in doing so.

The attempts made by [17] and [7] are illustrative. Both adopt an approach pioneered by
Ehrenfeucht et al. [9], later improved upon by Dahlhaus [6]. The idea is to pick an arbitrary vertex, say
$x$, and recursively compute the modular decomposition tree for its neighbourhood, $N(x)$, and its
non-neighbourhood, $\overline{N}(x)$. Any strong module not containing $x$ must be a module of either $G[N(x)]$
or $G[\overline{N}(x)]$, and therefore can be extracted from their recursively computed modular decomposition
trees. Once extracted, these can then be used to compute the strong modules containing $x$. The
two types of modules are then assembled to form the tree. Although this approach is conceptually
simple, [17] only managed an $O(n + m \log n)$ implementation, while [7] required advanced union-find
data structures and complicated charging arguments to achieve linear time.

The difficult step in the recursive approach is the computation of the strong modules containing x 
and their incorporation into the tree; in other words, the explicit construction of the tree. Capelle and 
Habib [1] responded by proposing the use of factorizing permutations: permutations of the vertices
in which the strong modules appear consecutively. They suggested that a factorizing permutation
be computed in place of the tree; if the tree is required it can be derived from the permutation
once, at the end of the algorithm, using the linear-time procedure in [1]. But how to compute the
permutation? The linear-time algorithm claimed in [12] contains an error that kills its simplicity,
and the algorithm of [15] has a $\log n$ factor. It seemed factorizing permutations merely traded one
bottleneck for another. 
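
To make the defining property concrete, the following minimal Python sketch checks whether a given vertex ordering is a factorizing permutation for a supplied family of strong modules. The function names and the input representation (a list of vertices and an iterable of vertex sets) are illustrative assumptions; the strong modules are taken as given here, and the sketch says nothing about how to compute the permutation itself.

```python
def is_consecutive(perm, vertex_set):
    """Return True if vertex_set occupies one contiguous block of perm."""
    positions = sorted(i for i, v in enumerate(perm) if v in vertex_set)
    if len(positions) != len(vertex_set):
        raise ValueError("vertex_set contains vertices missing from perm")
    # An empty set is trivially consecutive.
    return not positions or positions[-1] - positions[0] + 1 == len(positions)


def is_factorizing(perm, strong_modules):
    """Return True if every supplied strong module is consecutive in perm."""
    return all(is_consecutive(perm, set(m)) for m in strong_modules)
```

For instance, if {a, b} is a strong module, the ordering c, a, b, d passes the check while a, c, b, d does not.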

The real problem with the two approaches is that they were applied in isolation. This paper shows 
that the two are truly complementary: the recursively computed trees facilitate the computation of 
a factorizing permutation, which in turn facilitates the computation of the modular decomposition 
tree. By unifying the approaches in this way we produce an elegant, linear-time algorithm for the 
modular decomposition, thus realizing a long-standing goal in the area. We combine the best aspects 
of the two methods, maintaining the conceptual simplicity of both. This allows a straightforward 
proof of correctness. The only data-structure employed is an ordered list of trees, and on these, 
only elementary traversals are required. Moreover, to produce the factorizing permutation from 
the recursively computed trees, we introduce a procedure that generalizes partition refinement [22] 
from sets to trees. This and other ideas we develop here can also be applied to the transitive
orientation problem: the authors are confident of having developed the first simple, linear-time
transitive orientation algorithm by doing just this [26].

1.1 Preliminaries 

All graphs in this paper are simple and undirected. Connected components will simply be referred 
to as components, while the connected components of the complement will be referred to as
co-components. We will talk often of an ordered list of trees, which will sometimes be referred to as an
ordered forest. When we speak of them as defining an ordering of the graph's vertices, we mean a 
pre-ordering of the leaves of each tree in order. Note that sometimes a set of vertices will be referred 
to as a "tree". We do this to streamline the exposition; our intent will become clear. 
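
As an illustration of how an ordered forest defines a vertex ordering, the sketch below lists the leaves of each tree from left to right. The representation is an assumption made only for this example: an internal node is a Python list of children and a leaf is any non-list value.

```python
def leaf_order(forest):
    """Return the leaves of an ordered forest, tree by tree, left to right."""
    order = []

    def visit(node):
        if isinstance(node, list):      # internal node: visit children in order
            for child in node:
                visit(child)
        else:                           # leaf: a vertex of the graph
            order.append(node)

    for tree in forest:
        visit(tree)
    return order
```

Under this representation, the forest `[["a", ["b", "c"]], "x", ["d"]]` yields the ordering a, b, c, x, d.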

The modular decomposition tree will occasionally be referred to as the MD tree. The MD tree 
can be recursively defined as follows: the root of the tree corresponds to the entire graph; if the 
graph is disconnected, the root is called parallel and its children are the MD trees of its components; 
if the graph's complement is disconnected, the root is called series and its children are the MD trees 
of the co-components; in all other cases the root is called prime*, and its children are the MD trees
of the graph's maximal modules. Recall that the nodes in this tree are the graph's strong modules, 
which are those that don't overlap others. 
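
The definitions above can be sketched directly in Python. This is a naive illustration only, not the linear-time algorithm of this paper: the complement is built explicitly, so the cost is quadratic. The adjacency representation (a dict mapping each vertex to the set of its neighbours) and all function names are assumptions of the sketch.

```python
def is_module(adj, candidate):
    """True if every vertex outside candidate sees all or none of it."""
    candidate = set(candidate)
    for v in adj:
        if v in candidate:
            continue
        seen = len(adj[v] & candidate)
        if seen not in (0, len(candidate)):
            return False
    return True


def components(adj):
    """Connected components of the graph given by adj, as a list of sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def complement(adj):
    """Adjacency of the complement graph (quadratic; for illustration only)."""
    verts = set(adj)
    return {v: verts - adj[v] - {v} for v in adj}


def root_label(adj):
    """Label of the MD tree root, following the recursive definition."""
    if len(adj) == 1:
        return "leaf"
    if len(components(adj)) > 1:
        return "parallel"
    if len(components(complement(adj))) > 1:
        return "series"
    return "prime"
```

On the path a-b-c-d the root is prime, while on the 4-cycle the complement is disconnected and the root is series.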

1.2 Outline of the Paper 

The rest of the paper breaks down into four sections. The first provides an overview of the algorithm, 
explaining its operation and how this contributes to its correctness and the ultimate construction 
of the MD tree. In the next section we specify the algorithm in detail and sketch the proof of 
its correctness. An analysis of the algorithm's running time follows. The paper concludes with a 
discussion of our contributions. The appendix contains an example and some omitted proofs. 

2 Overview of the Algorithm 
2.1 Recursion 

The algorithm begins in a familiar way, selecting an arbitrary vertex, $x$, called the pivot, and placing
its neighbourhood to its left and its non-neighbourhood to its right, giving us the ordered list of
trees, $N(x), x, \overline{N}(x)$. Next, the modular decomposition tree for $G[N(x)]$ is recursively computed. As
this occurs, the neighbours of $N(x)$ in $\overline{N}(x)$ are "pulled forward" so that afterwards we have the
ordered list of trees, $T(N(x)), x, N_A(x), N_N(x)$, where $T(N(x))$ is the modular decomposition tree for
$G[N(x)]$, and $N_A(x)$ is the subset of $\overline{N}(x)$ with at least one neighbour in $N(x)$. The algorithm then
recursively computes the modular decomposition tree for $G[N_A(x)]$, pulling its neighbours in $N_N(x)$
forward in a similar fashion. And so on. Eventually we arrive at the following ordered list of trees:

$$T(N_0), x, T(N_1), \ldots, T(N_k), \qquad (1)$$

where the $N_i$'s correspond to the distance layers in a breadth-first search begun from $x$, and the
$T(N_i)$'s are their modular decomposition trees.

*This definition of prime differs somewhat from that which normally appears in the literature.

The rest of this paper assumes that the graph is connected and thus each vertex in $N_i$ has an
edge to $N_{i-1}$ (or $x$ in the case of $N_0$). When the graph is disconnected, the $N_i$'s up to $N_{k-1}$ along
with $x$ form one of its connected components. In this case the algorithm builds the MD tree for
this component as described below, then unifies the result with $T(N_k)$ under a common root labeled
parallel. This adds a constant amount of work to each stage. Each stage is defined by a pivot, and
vertices are only pivots once, so this work is consistent with linear time.
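
The distance layers themselves can be illustrated with an ordinary breadth-first search. The sketch below is only meant to show what the layers are; it computes them directly rather than through the recursive "pull forward" process described above, and it assumes a connected graph given as a dict from each vertex to the set of its neighbours. The function name is hypothetical. Following the paper's indexing, the first layer returned is the neighbourhood of the pivot.

```python
def distance_layers(adj, x):
    """Return [N_0, N_1, ...]: vertices at distance 1, 2, ... from pivot x."""
    layers = []
    seen = {x}
    frontier = [x]
    while frontier:
        next_layer = set()
        for u in frontier:
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    next_layer.add(w)
        if next_layer:
            layers.append(next_layer)
        frontier = list(next_layer)
    return layers
```

Vertices unreachable from the pivot are not returned by this sketch; in the paper's treatment they would make up the last entry of the list and be joined under a parallel root.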

2.2 Refinement 

We wish to transform the above ordered list of trees into a factorizing permutation that will help 
build the modular decomposition tree. We begin doing so by refining the trees using the active edges: 

Definition 2.1. An edge is active if it is incident to x or if its endpoints are in different Ni's. 
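
Definition 2.1 translates into a short filter over the edge set. The sketch below assumes the layers are supplied as a list of vertex sets and the graph as a dict from each vertex to the set of its neighbours; the function name and the use of `frozenset` pairs to represent undirected edges are choices made for this illustration. It scans every edge, so it does not reflect the incremental bookkeeping the running-time analysis relies on.

```python
def active_edges(adj, x, layers):
    """Return the edges incident to x or joining two different layers."""
    layer_of = {v: i for i, layer in enumerate(layers) for v in layer}
    active = set()
    for u in adj:
        for w in adj[u]:
            # The pivot x belongs to no layer, so test it before any lookup.
            if u == x or w == x or layer_of[u] != layer_of[w]:
                active.add(frozenset((u, w)))
    return active
```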

Refinement is a natural generalization of partition refinement from sets to trees, or equivalently, 
from sets to multi-sets. We process each vertex in turn and use its incident active edges to refine 
the trees other than its own. The process amounts to a simple recursive marking procedure and is 
specified in detail in section 3.1.
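
For orientation, here is a minimal sketch of classical partition refinement on sets, the technique being generalized. It is not the tree version used by the algorithm, and it is written for clarity rather than for the usual time bound proportional to the size of the pivot set. The function name and the list-of-sets representation are assumptions of the sketch.

```python
def refine(partition, pivot):
    """Split every class of an ordered partition by membership in pivot.

    A class that the pivot set cuts is replaced by two classes:
    the part inside pivot, followed by the part outside it.
    """
    pivot = set(pivot)
    result = []
    for cls in partition:
        inside, outside = cls & pivot, cls - pivot
        if inside and outside:
            result.extend([inside, outside])
        else:
            result.append(cls)          # untouched or fully contained: keep as is
    return result
```

Refining the partition {1, 2, 3}, {4, 5} by the set {2, 4, 5, 9} splits only the first class, into {2} and {1, 3}.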

To see how refinement moves us toward a factorizing permutation, first consider a strong module
not containing $x$, say $M$. Notice that for some $N_i$, we have $M \subseteq N_i$, with $M$ a module of $G[N_i]$. A
theorem of [19] says that either $M$ is a strong module in $G[N_i]$, and thus an internal node in $T(N_i)$,
or it is the union of siblings in $T(N_i)$. In the former case, the node corresponding to $M$ will be
unaffected by refinement; in the latter case, refinement will group the siblings under a new internal
node inserted into $T(N_i)$. Thus:

Lemma 2.2 (Proved in section 3.1). The strong modules not containing $x$ appear consecutively after
refinement.

We are not so fortunate with the strong modules containing $x$, although refinement does get
them close to appearing consecutively. As described above, refinement groups siblings under new
nodes. When those siblings are at depth-1, however, instead of making that new node the child of
the siblings' former parent, the new node is made the root of its own tree in our ordered list - the
siblings' old tree is effectively split. The intuition here comes from the special role played by the
(co-)components of the $G[N_i]$'s and their placement within the $T(N_i)$'s:

Proposition 2.3 (Proved in the appendix). If $C$ is a co-component of $G[N_0]$ and $M$ is a strong
module containing $x$, then either $C \subseteq M$ or $C \cap M = \emptyset$. Similarly for $C$ a component of $G[N_i]$, $i > 0$.

Of course, the (co-)components are either the roots of the $T(N_i)$'s or are the nodes at depth-1.
The result of this splitting is that the strong modules containing $x$ will be "bound" by the trees in
our ordered forest:

Lemma 2.4 (Proved in section 3.1). Let $T_k, \ldots, T_1, x, T'_1, \ldots, T'_\ell$ be the ordered forest resulting from
refinement, and let $M$ be a strong module containing $x$. Then there are bounding trees $T_i$ and $T'_j$
such that $M \supseteq T_{i-1} \cup \cdots \cup T_1 \cup \{x\} \cup T'_1 \cup \cdots \cup T'_{j-1}$ and
$M \subseteq T_i \cup \cdots \cup T_1 \cup \{x\} \cup T'_1 \cup \cdots \cup T'_j$.

2.3 Promotion 

When siblings are grouped under a new node during refinement it is because a vertex in a different 
tree is adjacent to them but not their other siblings. The siblings' former parent cannot therefore be 
a module; this is also true of all their ancestors. Refinement accounts for this by marking these nodes 
for deletion. When refinement has finished, the nodes without marked children will correspond to 
the strong modules not containing x. Promotion is the process of deleting all the marked nodes with 
marked children - internal nodes are "promoted" upward as their ancestors are deleted - leaving 
only the strong modules not containing x. 

The real benefit of promotion however is that it gives us the desired factorizing permutation. The 
strong modules not containing x are left intact and are therefore consecutive. But now the strong 
modules containing x will also be consecutive: as nodes are deleted from these modules' bounding 
trees, the ones that remain and that are in the module are placed next to the rest of the module. 
Section 3.2 details the procedure, a simple depth-first traversal of our ordered forest. So with nothing
more than elementary traversals of our ordered forest we arrive at a factorizing permutation: 

Lemma 2.5 (Proved in section 3.2). The ordered forest that results from promotion provides a
factorizing permutation. 

2.4 Assembly 

In fact, promotion gives us much more than a factorizing permutation: we have an ordered list of 
trees whose nodes (excepting x) correspond to the strong modules not containing x; moreover, each 
of these strong modules is itself properly decomposed (their parts were originally in their respective 
T{Ni)'s, and neither refinement nor promotion changes this). What remains, then, is to identify 
the strong modules containing x, determine the trees in our list constituting them, then use this 
information to assemble the modular decomposition tree. This was the part that proved difficult for 
the previous recursive algorithms. Our factorizing permutation makes it easy. 

With a factorizing permutation we know the strong modules containing x are nested: 



Since our ordered forest consists of the strong modules not containing x, no tree in it overlaps 
these brackets. So to build the MD tree, it suffices to insert the brackets between the trees in our list: 
once this is done, a node is made for each pair of brackets and a "spine" for the MD tree is built; to
this we merely affix the trees in our list according to the placement of the brackets. A simple greedy
algorithm, described in section 3.3, inserts the brackets. In this way the modular decomposition tree
is built with minimal effort.
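
The step from brackets to tree can be sketched as follows, under simplifying assumptions: the bracket positions are taken as already known and are given as nested, inclusive index intervals into the ordered list, innermost first; an internal node is represented as a plain Python list of its children; node labels (series, parallel, prime) are omitted. The function name and this input format are hypothetical and are not the greedy procedure of section 3.3.

```python
def build_spine(forest, x_index, brackets):
    """Nest the trees of an ordered list around x according to the brackets.

    forest   -- ordered list of subtrees, with the pivot at position x_index
    brackets -- nested inclusive (lo, hi) index intervals, innermost first
    """
    node = forest[x_index]
    lo_prev, hi_prev = x_index, x_index
    for lo, hi in brackets:
        if not (lo <= lo_prev and hi_prev <= hi):
            raise ValueError("brackets must be nested around x")
        # New spine node: trees newly enclosed on the left, the previous
        # spine node, then trees newly enclosed on the right.
        node = forest[lo:lo_prev] + [node] + forest[hi_prev + 1:hi + 1]
        lo_prev, hi_prev = lo, hi
    return node
```

With the list A, B, x, C, D and brackets around positions 1 to 3 and 0 to 4, the result nests the node containing B, x, C inside a root whose other children are A and D.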

3 Details and Correctness 
3.1 Refinement 

The refinement process described in the overview is given by algorithm 1.



Algorithm 1: Refinement of the ordered list of trees (1) by the active edges

foreach vertex v do
    Let $\alpha(v)$ be its incident active edges;
    Refine the list of trees using $\alpha(v)$ according to algorithm 2 such that:
    if v is to x's left then
        refine using left splits, and when a node is marked, mark it with "left";
    else if v is to x's right and refines a tree to x's left then
        refine using left splits, and when a node is marked, mark it with "left";
    else if v is to x's right and refines a tree to x's right then
        refine using right splits, and when a node is marked, mark it with "right";
    end
end



Below we sketch the proofs of lemmas 2.2 and 2.4. For the former we actually prove something
slightly stronger from which lemma 2.2 follows immediately:

Lemma 3.1. The nodes in the ordered list of trees resulting from refinement that do not have marked
children correspond exactly to the strong modules not containing $x$.

Proof. [Sketch] Let $M$ be a strong module not containing $x$. As stated in the overview, $M$ must be
entirely contained in some $N_i$, and it must be a module of $G[N_i]$. A theorem of [19] guarantees that $M$
is either a node in $T(N_i)$ or the union of children, say $c_1, \ldots, c_k$, of a series or parallel node in $T(N_i)$.
Appealing to algorithm 2 we see that in the former case it remains a node throughout refinement
and none of its children are ever marked, since each vertex outside $T(N_i)$ is either universal to, or
isolated from, the node. Algorithm 1 also makes clear that in the latter case the children will remain
siblings throughout refinement, and will not be marked at any time, since, again, each refining vertex
is either universal to them or isolated from them. So for contradiction, assume that after refinement
the $c_j$'s have a sibling $c$ different from them. Inspecting algorithm 2 we see that $c$ must have been
a sibling of the $c_j$'s in $T(N_i)$, and that $c$ and the $c_j$'s must have the same set of neighbours outside
$N_i$. Hence, $c \cup c_1$ is a module overlapping $c_1 \cup \cdots \cup c_k$, contradicting the latter being strong.

For the converse, consider a node $N$ without any marked children, and suppose $N$ was formed
from the refinement of $T(N_i)$. Clearly, the vertices of $N$ have the same neighbours outside $T(N_i)$.

Algorithm 2: Refinement of an ordered list of trees by the set X

Let $T_1, \ldots, T_k$ be the maximal subtrees in the forest whose leaves are all in X;
Let $P_1, \ldots, P_l$ be the set of parents of the $T_j$'s;
foreach non-prime $P_i$ do
    Let A be the set of $P_i$'s children amongst the $T_j$'s, and B its remaining children;
    Let $T_a$ either be the single tree in A or the tree formed by unifying the trees in A under a
    common root, and define $T_b$ symmetrically;
    Assign $P_i$'s label to $T_a$ and $T_b$;
    if $P_i$ is a root then
        Replace $P_i$ in the forest with either $T_a, T_b$ (left split) or $T_b, T_a$ (right split)
    else
        Replace the children of $P_i$ with $T_a$ and $T_b$;
    end
    Mark the roots of $T_a$ and $T_b$ as well as all their ancestors;
end
foreach prime $P_i$ do
    Mark $P_i$ as well as all of its children and all of its ancestors;
end



By algorithm 1, if $N$ is prime, it existed in $T(N_i)$ and so has the same neighbours within $T(N_i)$. This
is also true when $N$ is not prime, since its children must have been children of the same non-prime
node in $T(N_i)$. Hence, each node with unmarked children is a module. If the node existed in $T(N_i)$
then it is clearly strong. If it is new, a simple case analysis shows that no other module can overlap
it, since two overlapping modules must be a module themselves. □

Proof. [Sketch of lemma 2.4] We prove this by induction on the number of vertices refining. Prior
to refinement we have the ordered list of trees $T(N_0), x, T(N_1), \ldots, T(N_k)$. In the appendix we show
that if $M \cap N_i \neq \emptyset$ for some $i > 1$, then $M = V$. Thus, the lemma holds prior to refinement since
$T(N_0)$ and either $T(N_1)$ or $T(N_k)$ can be taken as the bounding trees. So suppose there are such
bounding trees $T_i$ and $T'_j$ after some number of vertices have refined; now consider what happens
after the next vertex refines. Clearly we need only focus on $T_i$ and $T'_j$; we'll argue the case for $T_i$,
with the case for $T'_j$ being similar.

Now, if $T_i$ is not split we are done, so assume $T_i$ is split and replaced by the trees $T_a, T_b$ in order.
Let $v$ be the vertex doing the refining and observe that $v$ is universal to the leaves of $T_a$ and not
universal to the leaves of $T_b$; additionally, we must have $v \in N_1$. If $v \in M$ as well, then $v$ is universal
to the portion of $T_i$ outside $M$ and hence we take $T_a$ as the new left-bounding tree. If $v \notin M$, then
it is isolated from the portion of $T_i$ in $M$, and so we take $T_b$ as the new left-bounding tree in this
case. □

3.2 Promotion



The promotion process is given by algorithm 3. Below we sketch the proof of lemma 2.5. The key
here is that refinement distinguishes between nodes marked "left" and "right" and promotion handles
these cases differently.



Algorithm 3: The promotion algorithm

while there is a root r with a child c both marked by "left" do
    Remove from r the subtree rooted at c and place it just before r;
end
while there is a root r with a child c both marked by "right" do
    Remove from r the subtree rooted at c and place it just after r;
end
Delete all marked roots in the forest with one child, replacing them with that child;
Delete all marked roots in the forest with no children;
Remove all marks;



Proof. [Sketch of lemma 2.5] By lemma 3.1 and inspection of algorithm 3, we see that the strong
modules not containing $x$ will appear consecutively after promotion.

Let $M$ be a strong module containing $x$. Let $T_i$ and $T'_j$ be the bounding trees provided by
lemma 2.4. It suffices to show that promotion deletes nodes in such a way as to place the portions
of $T_i$ and $T'_j$ that are in $M$ next to the other vertices in $M$. We'll focus on $T'_j$, with the case for $T_i$
following similarly.

In the proof of lemma 2.4 we observed that if $M \cap N_i \neq \emptyset$ for some $i > 1$, then $M = V$. As such,
we'll assume $T'_j$ is composed of vertices in $N_1$. If $T'_j$ only contains vertices in $M$, then clearly we
are done since promotion does not rearrange trees in our ordered list. So assume $T'_j$ contains some
vertices in $M$ and some outside $M$. By proposition 2.3 this means it contains vertices in at least
two different components of $G[N_1]$, say $C$ and $C'$ with $C \subseteq M$ and $C' \cap M = \emptyset$. Now, $C$ and $C'$
were siblings at depth-1 in $T(N_1)$, and by assumption, some portion of each remains in the same tree
after refinement. Appealing to algorithm 2 we see that this is only possible if all vertices in $C$ and
$C'$ remain in the same tree after refinement; that is, $C$ and $C'$ must still be siblings after refinement,
which means they remained siblings throughout refinement.

If both $C$ and $C'$ share the same neighbours outside $N_1$, then $C \cup C'$ is a module overlapping $M$,
contradicting $M$ being strong. It follows that at least one of $C$ and $C'$ is marked "left" or "right"
(or both). We now consider the cases:

Case 1: Assume $C'$ is marked by "left". This means a vertex in $N_0$ is adjacent to some but
not all vertices in $C'$; let $v$ be the first such vertex. Note that $v \notin M$ if it is adjacent to some of $C'$;
thus, $v$ is universal to $C$. But we remarked above that $C$ and $C'$ had the same parents throughout
refinement; so at the time $v$ refined, it would have split $C$ away from $C'$, contradicting their being
siblings afterwards. This case is therefore impossible.






Case 2: Assume $C'$ is marked by "right". Observe that no vertex in $C$ can be adjacent to a
vertex in $N_i$, $i > 1$, since such vertices are outside $M$ and not adjacent to $x$. Thus $C$ cannot be
marked by "right". Thus, promotion places the vertices of $C'$ to the right of those in $C$.

Case 3: Assume $C'$ is not marked by a split. Then $C$ must be marked by a split, as argued
above, and as seen in case 2, it must be a left-split that marks it. Thus, promotion places the vertices
of $C$ to the left of those of $C'$.

In all cases, promotion puts $C$ to the left of $C'$. Since $C$ and $C'$ were chosen arbitrarily, we can
conclude that the vertices of $M$ appear consecutively. □

3.3 Assembly 

After promotion we are left with an ordered list of trees representing a factorizing permutation. As 
we explained in the overview, the problem of constructing the modular decomposition tree reduces 
to placing brackets between these trees in a way that delineates the strong modules containing x. 
We can actually simplify things even further. 

Recall from the end of section 2.2 that the (co-)components of the $G[N_i]$'s appear consecutively
prior to refinement. A look at algorithms 1 and 3 confirms that this holds after promotion
as well. Our ordered list of trees can therefore be viewed as an ordered list of (co-)components:
$C_k, \ldots, C_1, x, C'_1, \ldots, C'_\ell$, where the $C_i$'s correspond to the co-components of $G[N_0]$, and the $C'_i$'s
correspond to the components of the $G[N_j]$'s, $j > 0$. Proposition 2.3 allows us to place the brackets
between these instead. A simple greedy procedure based on the following lemma does this easily:

Lemma 3.2 (Proved in the appendix). Let $M$ be the smallest strong module containing $x$. Then
$M$ satisfies one of the following three conditions:

(i) $M$ is the maximally contiguous module containing $x$ and no $C'_i$ (in which case $M$ is series);

(ii) $M$ is the maximally contiguous module containing $x$ and no $C_i$, and only $C'_i$'s in $N_1$ with no
edge to their right (in which case $M$ is parallel);

(iii) $M$ is the minimally contiguous module containing $x$ and at least $C_1$ and $C'_1$ (in which case $M$
is prime).

To use the lemma we first determine, for each $C'_i$ in $N_1$, if any of its vertices has a neighbour in an $N_j$, $j > 1$
(as required by (ii) above). Next, we determine the $\mu$-values of the (co-)components: for $C_i$ this
is defined as follows: let $C'_j$ be the component with smallest index such that $C'_j, \ldots, C'_\ell$ are all
isolated from $C_i$, then $\mu(C_i)$ is $x$ if $j = 1$ and $C'_{j-1}$ otherwise; the $\mu$-values for the $C'_i$'s are defined
symmetrically. These $\mu$-values help the procedure determine when a module is formed.

Given this information, the procedure can follow the lemma directly, first trying for a series
module, then a parallel module if this fails, and finally a prime module failing this. Series and parallel
modules are attempted by comparing the $\mu$-values of the (co-)components against $x$ and maximally
adding those for which the two are equal. Prime modules are formed by first adding $C_1$ and $C'_1$,
and then iteratively applying the following rule: once a $C_i$ is added, so too must be $C'_1, \ldots, \mu(C_i)$
(symmetrically for a $C'_i$ being added), stopping when the rule can no longer be applied. Once a
module is found, brackets are placed accordingly and the process begins anew, treating the just
formed module as though it were $x$.

4 Running Time and Implementation 

4.1 Recursion 

In order to effect the partitioning required of the recursion, we need to traverse the pivot's adjacency 
list in its entirety. However, each vertex is a pivot exactly once during the algorithm, so this is 
consistent with linear-time. 

We will need to isolate the incident active edges of each vertex so that refinement, promotion, 
and assembly can be performed efficiently; this can be done during the recursion. Initially we assume 
all vertices are marked as unvisited and that each has associated with it an empty list denoted by
$\alpha$ (which will be used to store the incident active edges). As pivots are chosen during the recursion
they are marked as visited. When a pivot's adjacency list is traversed, it is appended to the $\alpha$-list
of all its visited neighbours. Thus, after recursion the $\alpha$-lists of each vertex in $N_i$ will correspond to
their incident active edges to $N_{i-1}$. The rest of their active edges can then be added by traversing
the $\alpha$-list of each vertex, and appending vertices to the other $\alpha$-lists in the obvious way. At the end
of each stage the $\alpha$-lists must be cleared to satisfy our induction hypothesis. We can thus assume
that the active edges at each stage can be isolated at the cost of work proportional to their number. 
Notice that each edge is active precisely once during the algorithm, so this effort is consistent with 
linear-time overall. 

4.2 Refinement 

A simple recursive marking procedure finds the maximal subtrees required by algorithm 2. All nodes
in our trees have at least two children, so the sizes of these subtrees are linear in the number of their 
leaves, which is equal to the number of incident active edges of the vertex refining. Notice that each 
vertex has at least one incident active edge. Thus, finding these trees (and the constant amount 
of work required afterward) is proportional to the number of active edges at each stage and so is 
consistent with linear-time. 

The children of a prime node need only be marked once, and the ancestors of a node need only be 
marked twice (once each for "left" and "right"). The time for this marking is therefore proportional 
to the size of our ordered forest, which is linear in the number of its leaves, which is linear in the 
number of active edges (since each leaf has at least one active edge), and hence consistent with 
linear-time overall. 

4.3 Promotion 

If we implement promotion in a depth-first manner, we see that it requires no more than a single
traversal of our ordered forest, which, as just observed, is consistent with linear time.

4.4 Assembly

Identifying the (co-)components requires at most two traversals of the forest: one prior to refinement 
to mark them and one after promotion to retrieve them. Determining if a $C'_i$ has an edge to its right
needs only a traversal of each vertex's $\alpha$-list. Computing the $\mu$-values of the (co-)components can be
accomplished by processing each vertex in order and traversing its $\alpha$-list. All this work is therefore
consistent with linear-time. 

The placement of the brackets amounts to a single traversal of the list of (co-) components, each 
of which contains an active edge, and so is consistent with linear-time. 

The final assembly of the tree can be done merely by traversing our ordered forest, and is therefore 
consistent with linear-time. 

5 Conclusion 

Like other algorithmic problems of comparable importance, research in modular decomposition has 
focused on finding a simple, efficient algorithm for its computation. This paper finally provides 
such an algorithm. There have been many previous attempts, but all have either failed to achieve 
linear-time or were complicated to the point of being impractical. Our algorithm suffers from no such 
shortcomings. Its elegance derives from unifying two existing approaches, utilizing the best elements 
from each. The unification is effected through the introduction of a new refinement technique which 
generalizes partition refinement from sets to trees. To our knowledge, no similar type of procedure 
has so far been formalized. With so many applications for traditional partition refinement (see, e.g., 
[14]), the authors are hopeful this tree refinement will find further application in the near future,
especially given the breakthrough it proves to be here. Already, it and other ideas from this paper 
have been applied to the transitive orientation problem, with the authors confident of having achieved 
the first simple, linear-time algorithm for transitively orienting a graph. 

References 

[1] C. Capelle, M. Habib, and F. de Montgolfier. Graph decompositions and factorizing
permutations. Discrete Mathematics and Theoretical Computer Science, 5:55-70, 2002.

[2] M. Chein, M. Habib, and M.C. Maurer. Partitive hypergraphs. Discrete Mathematics, 37:35-50, 
1981. 

[3] D.G. Corneil, Y. Perl, and L.K. Stewart. A linear recognition algorithm for cographs. SIAM
Journal on Computing, 14:926-934, 1985.

[4] A. Cournier and M. Habib. A new linear algorithm of modular decomposition. In Trees in
Algebra and Programming (CAAP), volume 787 of Lecture Notes in Computer Science, pages
68-84, 1994.

[5] D.D. Cowan, L.O. James, and R.G. Stanton. Graph decomposition for undirected graphs. In 3rd
S-E Conference on Combinatorics, Graph Theory and Computing, Utilitas Math, pages 281-290,
1972.

[6] E. Dahlhaus. Efficient parallel algorithms for cographs and distance hereditary graphs. Discrete
Applied Mathematics, 57:29-54, 1995.

[7] E. Dahlhaus, J. Gustedt, and R.M. McConnell. Efficient and practical algorithms for sequential
modular decomposition. Journal of Algorithms, 41(2):360-387, 2001.

[8] Celina M. H. de Figueiredo and Frederic Maffray. Optimizing bull-free perfect graphs. SIAM
J. Discret. Math., 18(2):226-240, 2005.

[9] A. Ehrenfeucht, H.N. Gabow, R.M. McConnell, and S.L. Sullivan. An $O(n^2)$ divide-and-conquer
algorithm for the prime tree decomposition of two-structures and modular decomposition of
graphs. Journal of Algorithms, 16:283-294, 1994.

[10] J. Gagneur, R. Krause, T. Bouwmeester, and G. Casari. Modular decomposition of protein- 
protein interaction networks. Genome Biology, 5(8):R57, 2004. 

[11] T. Gallai. Transitiv orientierbare graphen. Acta Math. Acad. Sci. Hungar., 18:25-66, 1967. 

[12] M. Habib, F. de Montgolfier, and C. Paul. A simple linear-time modular decomposition al- 
gorithm for graphs, using order extension. In Scandinavian Workshop on Algorithm Theory 
(SWAT), volume 3111 of Lecture Notes in Computer Science, pages 187-198, 2004. 

[13] M. Habib and M.C. Maurer. On the x-join decomposition of undirected graphs. Discrete Applied 
Mathematics, 1:201-207, 1979. 

[14] M. Habib, R.M. McConnell, C. Paul, and L. Viennot. Lex-BFS and partition refinement, with 
applications to transitive orientation, interval graph recognition and consecutive ones testing. 
Theoretical Computer Science, 234:59-84, 2000. 

[15] M. Habib, C. Paul, and L. Viennot. A synthesis on partition refinement: a useful routine for 
strings, graphs, boolean matrices and automata. In 15th Symposium on Theoretical Aspects of 
Computer Science (STACS), volume 1373 of Lecture Notes in Computer Science, pages 25-38, 
1998. 

[16] R.M. McConnell and J. Spinrad. Linear-time modular decomposition and efficient transitive 
orientation of comparability graphs. In 5th Annual ACM-SIAM Symposium on Discrete Algorithms 
(SODA), pages 536-545, 1994. 

[17] R.M. McConnell and J. Spinrad. Ordered vertex partitioning. Discrete Mathematics and The- 
oretical Computer Science, 4:45-60, 2000. 

[18] R.H. Möhring. Algorithmic aspects of comparability graphs and interval graphs. In I. Rival, 
editor, Graphs and Orders, pages 41-101. D. Reidel, Boston, 1985. 

[19] R.H. Möhring. Algorithmic aspects of the substitution decomposition in optimization over 
relations, set systems and boolean functions. Annals of Operations Research, 4:195-225, 1985. 

[20] R.H. Möhring and F.J. Radermacher. Substitution decomposition for discrete structures and 
connections with combinatorial optimization. Annals of Discrete Mathematics, 19:257-356, 1984. 

[21] J.H. Muller and J. Spinrad. Incremental modular decomposition. Journal of the ACM, 36(1):1- 
19, 1989. 

[22] R. Paige and R.E. Tarjan. Three partition refinement algorithms. SIAM J. Comput., 16(6):973- 
989, 1987. 

[23] C. Papadopoulos and C. Voglis. Drawing graphs using modular decomposition. In Patrick Healy 
and Nikola S. Nikolov, editors, Graph Drawing, Limerick, Ireland, September 12-14, 2005, pages 
343-354. Springer, 2006. 

[24] A. Pnueli, S. Even, and A. Lempel. Transitive orientation of graphs and identification of per- 
mutation graphs. Canad. J. Math., 23:160-175, 1971. 

[25] J. Spinrad. Efficient graph representation, volume 19 of Fields Institute Monographs. American 
Mathematical Society, 2003. 

[26] M. Tedder, D.G. Corneil, M. Habib, and C. Paul. Simple, Linear-time Transitive Orientation. 
(In preparation). 






Appendix 

Omitted Proofs 

We prove proposition 2.3 from the overview: 

Proposition 2.3. If C is a co-component of $G[N_0]$ and M is a strong module containing x, then 
either $C \subseteq M$ or $C \cap M = \emptyset$. Similarly for C a component of $G[N_i]$, $i > 0$. 

Proof. Let C be a co-component of $G[N_0]$ and M a strong module containing x. Obviously, 
$M - C \neq \emptyset$ because of x. Suppose for contradiction that $C \cap M \neq \emptyset$ and 
$C - M \neq \emptyset$, and say $C_1 = C \cap M$ and $C_2 = C - C_1$. Since $x \in M$ and 
$C \subseteq N(x)$, we must have a join between $C_1$ and $C_2$. But then this contradicts C being 
a co-component of $G[N_0]$. The case where C is a component of some $G[N_i]$, $i > 0$, 
is exactly symmetric. □ 
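The proofs in this appendix repeatedly test whether a vertex outside a set is "universal to" or "isolated from" that set. As a minimal sketch of that test, the following Python function checks the module condition directly from the definition (every outside vertex sees all of the set or none of it). The function name `is_module`, the adjacency-dictionary representation, and the toy graph are illustrative assumptions and are not taken from the paper.

```python
def is_module(adj, candidate):
    """Return True if every vertex outside `candidate` is adjacent to
    all of `candidate` or to none of it (the module condition)."""
    candidate = set(candidate)
    for v in adj:
        if v in candidate:
            continue
        seen = adj[v] & candidate
        # A vertex that sees some but not all of the set splits it.
        if seen and seen != candidate:
            return False
    return True


# Hypothetical toy graph (not the paper's example): a star centred on b.
adj = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b"},
}

print(is_module(adj, {"a", "c"}))  # b sees both a and c, d sees neither
print(is_module(adj, {"a", "b"}))  # c sees b but not a
```

This quadratic check is only meant to make the definition concrete; it plays no part in the linear-time algorithm.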

The following fact was used in the proof of lemmas 2.4 and 2.5: 

Proposition 5.1. If M is a strong module containing x, and $M \cap N_i \neq \emptyset$ for some 
$i > 1$, then $M = V$. 

Proof. Let u be a vertex in $M \cap N_i$, $i > 1$. First we show that $N(x) \subseteq M$. Suppose for 
contradiction there is some $v \in N(x) - M$. Note that v must be universal to M, which is impossible 
since it cannot be adjacent to u. Next we show $N_1 \subseteq M$. Suppose for contradiction there is a 
$q \in N_1 - M$. Note that q must be isolated from M. But this is impossible as $q \in N_1$ and 
therefore has at least one neighbour in $N(x) \subseteq M$. Thus $N_1 \subseteq M$. We simply need to 
progressively apply a symmetric argument to show that $N_2 \subseteq M$, $N_3 \subseteq M$, .... □ 

We now prove lemma 3.2: 

Lemma 3.2. Let M be the smallest strong module containing x. Then M satisfies one of the 
following three conditions: 

(i) M is the maximally contiguous module containing x and no $C'_i$ (in which case M is series); 

(ii) M is the maximally contiguous module containing x and no $C_i$, and only $C'_j$'s in $N_1$ with no 
edge to their right (in which case M is parallel); 

(iii) M is the minimally contiguous module containing x and at least $C_1$ and $C'_1$ (in which case M 
is prime). 

Proof. By proposition 2.3, each (co-)component is either entirely contained in M or entirely outside 
M. Hence, we need only consider modules including x and formed by including or excluding these 
(co-)components. Of course, the (co-)components constituting M must be contiguous because our 
ordering represents a factorizing permutation. 






We first show that if M is series, it cannot contain any $C'_i$. So assume for contradiction that 
M contains a $C'_i$. We can therefore assume that M contains $C'_1$, since our ordering is a factorizing 
permutation. However, because M is the smallest strong module containing x, and M is series, we 
must have x as a co-component in the graph induced on M. In other words, x must be universal to 
the vertices in the graph induced on M. With $C'_1$ in M, this is of course impossible. 

So if M is series, it can contain no $C'_i$. We now must show that M must be the maximally 
contiguous module with this property. Let M' be the maximally contiguous module containing no 
$C'_i$. Note that by the maximality of M' and M being strong, we must have $M' \supseteq M$. Assume for 
contradiction that $M' \neq M$. Then there must be a $C_i \in M' - M$. Of course, we must also have 
some $C_j \in M$. Observe that $C_i \cup C_j$ is a module. Moreover, it is a module that overlaps M (recall 
that $x \in M$ as well), contradicting M being strong. Thus, we must have $M = M'$, and therefore M 
is the maximally contiguous module not containing any $C'_i$. 

Now consider the case where M is parallel. First we show that M cannot contain any $C'_i$ in an 
$N_j$, $j > 1$. In this case, proposition 5.1 says that $M = V$. If M is to be parallel the graph induced on 
it must be disconnected, which is impossible since we assumed in section 2.1 that the graphs in this 
paper were all connected. To show that M cannot contain any $C'_i$ from $N_1$ that contains an edge to 
its right, say to $v \in N_j$, $j > 1$, assume otherwise for contradiction. We must then have $v \in M$ since 
it is adjacent to some vertex in $C'_i$ but not to x. But then the component of v must be added to M, 
by proposition 2.3, which we just saw is impossible. 

We next need to show that when M is parallel it is the maximally contiguous module containing 
only components of $G[N_1]$ without edges to their right. We saw above that it can only contain 
components of $G[N_1]$ without edges to their right, and earlier that it must be contiguous, so we need 
only show that it is maximally so. Let M' be the maximally contiguous module only containing 
$C'_i$'s in $N_1$ with no edge to their right. Observe that since M is strong and M' maximal, we must 
have $M' \supseteq M$. Assume for contradiction that $M' \neq M$. Thus, there is a $C'_i \in M' - M$, and also a 
$C'_j \in M$. Notice that $C'_i \cup C'_j$ is a module that overlaps M (recall that $x \in M$ as well), contradicting 
M being a strong module. 

So now assume M is prime. In this case the graph induced by M cannot be disconnected, nor 
can its complement be disconnected. As such, M cannot consist entirely of $C_i$'s, nor can it consist 
entirely of $C'_j$'s. Because this is a factorizing permutation, we must then have $C_1 \subseteq M$ and 
$C'_1 \subseteq M$. Hence, we need only show that M is the minimally contiguous module containing 
$C_1$ and $C'_1$. 

Let M' be the minimally contiguous module containing $C_1$ and $C'_1$. Since M is strong and M' 
minimal, we must have $M' \subseteq M$. Assume for contradiction that $M' \neq M$. Recall the theorem of 
[19] employed earlier, saying that when a module is not strong, it is the union of (strong) siblings 
in the MD tree. These siblings must be descendants of M in the MD tree. Moreover, x must be a 
descendant of one of these siblings. But if M is the smallest strong module containing x, we must 
have M as the parent of x in the MD tree, which gives us the desired contradiction. □ 






An Example 

A graph G is described in figure 1 by the modular decomposition tree pictured therein. In it, prime 
nodes are labeled by the graph their children induce, while series nodes are labeled by 1 and parallel 
nodes by 0, following the cograph convention. We demonstrate how our algorithm operates when 
given G as input. 

Assume x is the vertex chosen to start the algorithm. In this case, $N(x) = N_0 = \{c, d, e, a\}$, 
$N_1 = \{f, g, h, i, b, j, k, l, m, n, p, q\}$, and $N_2 = \{r\}$. Figure 2 displays the modular decomposition 
trees recursively computed: $T(N_0)$, $T(N_1)$, and $T(N_2)$. 
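As a rough sketch of how the sets $N_0$, $N_1$, $N_2$ of this example arise, the following Python code runs a breadth-first search from x over the active edges listed in table 1. It assumes that $N_i$ collects the vertices at distance $i + 1$ from x, which matches the sets quoted above, and it relies on the fact that edges inside a layer do not change distances from x. The helper names `build_active_adjacency` and `distance_layers` are illustrative, not the paper's.

```python
from collections import deque


def build_active_adjacency():
    """Symmetric adjacency built from the active-edge lists of table 1."""
    table = [
        ("a", "x b j f g h i k l m n p q"),
        ("c d e", "x i"),
        ("b j f g h k l m n p", "a"),
        ("i", "a c d e"),
        ("q", "a r"),
        ("r", "q"),
    ]
    adj = {}
    for vertices, alpha in table:
        for u in vertices.split():
            for v in alpha.split():
                adj.setdefault(u, set()).add(v)
                adj.setdefault(v, set()).add(u)
    return adj


def distance_layers(adj, source):
    """Map i -> set of vertices at distance i + 1 from `source` (assumed N_i)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    layers = {}
    for v, d in dist.items():
        if d > 0:
            layers.setdefault(d - 1, set()).add(v)
    return layers


layers = distance_layers(build_active_adjacency(), "x")
for i in sorted(layers):
    print(f"N_{i} =", sorted(layers[i]))
```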

We use $\alpha(u)$ to denote the list of incident active edges of the vertex u. The active edges in our 
example are summarized in table 1. Using the active edges, the algorithm refines each tree in the 
forest; the result is displayed in figure 3. The shading in the diagram corresponds to the marks on 
the nodes: horizontal shading for "left" marks, vertical shading for "right" marks, and cross-hatched 
shading when a node has both "left" and "right" marks. Promotion is applied to these marked 
nodes, with the result being figure 4. 

Read the trees of figure 4 from left-to-right and label them $T_2, T_1, T'_1, T'_2, T'_3, T'_4, T'_5, T'_6$. We are 
interested in the following ordered list of trees: 

$$T_2,\ T_1,\ x,\ T'_1,\ T'_2,\ T'_3,\ T'_4,\ T'_5,\ T'_6 \qquad (2)$$ 

We now rephrase this list in terms of the (co-)components of the $G[N_i]$'s as we described in the 
paper. From figure 2, we see that $G[N_0]$ is series, and has co-components {a} and {c, d, e}. We 
also see that $G[N_1]$ is parallel, and has the components {b}, {j}, {i, g, h, f}, {k, l}, and {q, m, n, p}. 
Lastly, the figure tells us that {r} is the only component of $G[N_2]$. Observe from figure 4 that each 
of these (co-)components appears consecutively after promotion. Reading these from left-to-right we 
can view the list of (2) as: 

$$C_2,\ C_1,\ x,\ C'_1,\ C'_2,\ C'_3,\ C'_4,\ C'_5,\ C'_6,$$ 

where $C_2 = \{a\}$, $C_1 = \{c, d, e\}$, $C'_1 = \{g, h, f, i\}$, $C'_2 = \{b\}$, $C'_3 = \{k, l\}$, $C'_4 = \{j\}$, 
$C'_5 = \{q, m, n, p\}$, and $C'_6 = \{r\}$. 

The algorithm must now insert brackets between these (co-)components in such a way as to 
delineate the strong modules containing x. To do this we first determine which of the $C'_i$'s have 
edges to their right. In this case only $C'_5 = \{q, m, n, p\}$ does, by virtue of q being adjacent to r. Next 
we must calculate the $\mu$-values for each (co-)component, as was described in the paper. These values 
are summarized in table 2. We can now proceed to introduce the brackets. 
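The "edge to the right" test in this step can be sketched in a few lines of Python. The active-edge lists below are transcribed from table 1 and the component labels follow the list above (with $C'_4 = \{j\}$ taken to be the one remaining component of $G[N_1]$). The name `has_edge_to_right` and the data layout are illustrative assumptions rather than the paper's data structures.

```python
# Active-edge lists from table 1, restricted to the vertices of N_1.
ALPHA = {v: {"a"} for v in "bjfghklmnp"}
ALPHA["i"] = {"a", "c", "d", "e"}
ALPHA["q"] = {"a", "r"}

# Components of G[N_1], labelled as in the ordered list of the example.
COMPONENTS = {
    "C'_1": {"g", "h", "f", "i"},
    "C'_2": {"b"},
    "C'_3": {"k", "l"},
    "C'_4": {"j"},  # assumed: the only component not otherwise labelled
    "C'_5": {"q", "m", "n", "p"},
}

N2 = {"r"}  # the only layer to the right of N_1 in this example


def has_edge_to_right(component, alpha, right_layer):
    """True if some vertex of the component has an active edge into right_layer."""
    return any(alpha[v] & right_layer for v in component)


with_right_edge = [
    name for name, comp in COMPONENTS.items()
    if has_edge_to_right(comp, ALPHA, N2)
]
print(with_right_edge)
```

Under these assumptions only $C'_5$ is reported, in agreement with the text: the edge from q to r is the single active edge leaving $N_1$ to the right.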

The first set of brackets will correspond to the smallest strong module containing x. We first 
try a series module by comparing $\mu(C_1)$ with x. Notice that they are not equal, so a series module 
cannot be formed. Next, a parallel module is attempted by comparing $\mu(C'_1)$ with x. Once more, 
they are not equal, and so a parallel module cannot be formed. We now know that M must be prime 
and include $C_1$ and $C'_1$. Therefore, $C'_1, \ldots, \mu(C_1) = C'_1$ must be included as well (it already is), and 
so must be $C_1, \ldots, \mu(C'_1) = C_1$ as well (it already is). Thus, $C_1 \cup \{x\} \cup C'_1$ represents the minimal 
contiguous module containing x and $C_1$ and $C'_1$. We have therefore found the smallest strong module 
containing x. We bracket it accordingly and move on: 

$$C_2,\ [C_1,\ x,\ C'_1],\ C'_2,\ C'_3,\ C'_4,\ C'_5,\ C'_6.$$ 

We once more try for a series module, but this time compare $\mu(C_2)$ with $C'_1$ (the last component 
in the previous module). These are not equal so a series module cannot be formed. So we try for 
a parallel module, comparing $\mu(C'_2)$ with $C_1$ (the last co-component in the previous module). Here 
they are equal, so perhaps a parallel module can be formed. We must also check that $C'_2$ does not 
have an edge to its right, which it doesn't (recall that only $C'_5$ does), and so a parallel module can 
in fact be formed. We now maximally add components in the same way. Doing so allows us to add 
$C'_3$ and $C'_4$. Although $\mu(C'_5)$ also equals $C_1$, it cannot be added because it has an edge to its right. 
We bracket this module accordingly and move on: 

$$C_2,\ [[C_1,\ x,\ C'_1],\ C'_2,\ C'_3,\ C'_4],\ C'_5,\ C'_6.$$ 

Again, we first try for a series module. Here $\mu(C_2)$ does not equal $C'_4$ so none can be formed. No 
parallel module can be formed because $C'_5$ has an edge to its right. Thus, the module is prime and 
must include both $C_2$ and $C'_5$. It thus also includes $C'_1, \ldots, \mu(C_2) = C'_6$, and so we know this module 
corresponds to the entire graph. We create the necessary brackets: 

$$[C_2,\ [[C_1,\ x,\ C'_1],\ C'_2,\ C'_3,\ C'_4],\ C'_5,\ C'_6].$$ 

Based on the above bracketing we can now construct the tree according to the procedure outlined 
in the paper, the result of which is clearly the tree of figure 1. 
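To make the final bracketing concrete, the sketch below encodes it as nested Python lists and reads off the vertex set of each bracket containing x, from the innermost outwards. It only unpacks the brackets and is not the tree-construction procedure referred to above. The names `PARTS`, `BRACKETING`, `vertices` and `modules_containing_x` are illustrative, and $C'_4 = \{j\}$ is assumed to be the one remaining component of $G[N_1]$.

```python
# (Co-)components of the example, keyed by their labels in the ordered list.
PARTS = {
    "x": {"x"},
    "C_2": {"a"},
    "C_1": {"c", "d", "e"},
    "C'_1": {"g", "h", "f", "i"},
    "C'_2": {"b"},
    "C'_3": {"k", "l"},
    "C'_4": {"j"},  # assumed label for the remaining component
    "C'_5": {"q", "m", "n", "p"},
    "C'_6": {"r"},
}

# The final bracketing of the example, one nested list per bracket pair.
BRACKETING = ["C_2", [["C_1", "x", "C'_1"], "C'_2", "C'_3", "C'_4"], "C'_5", "C'_6"]


def vertices(node):
    """Vertex set covered by a label or by a (nested) bracket."""
    if isinstance(node, str):
        return set(PARTS[node])
    return set().union(*(vertices(child) for child in node))


def modules_containing_x(node):
    """Vertex sets of the brackets containing x, innermost first."""
    chain = []
    while isinstance(node, list):
        chain.append(vertices(node))
        # Descend into the unique child that still contains x.
        node = next(c for c in node if isinstance(c, list) or c == "x")
    return chain[::-1]


modules = modules_containing_x(BRACKETING)
for m in modules:
    print(len(m), sorted(m))
```

The three sets printed have 8, 12 and 18 vertices: the prime module $C_1 \cup \{x\} \cup C'_1$, the parallel module that adds $C'_2$, $C'_3$ and $C'_4$, and the whole vertex set.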



vertex u                        $\alpha(u)$
a                               x, b, j, f, g, h, i, k, l, m, n, p, q
c, d, e                         x, i
b, j, f, g, h, k, l, m, n, p    a
i                               a, c, d, e
q                               a, r
r                               q

Table 1: The active edges for the graph G after x is chosen to start the algorithm. 

Figure 1: The modular decomposition tree for a graph G. Prime nodes are labeled by the graphs 
their children induce. Series nodes are labeled by 1 while parallel nodes are labeled by 0, as per the 
cograph convention. 





Figure 3: The trees of figure 2 after the active edges (table 1) have refined them. Horizontal shading 
represents a node marked by a left split; vertical shading represents a node marked by a right split; 
cross-hatched shading represents a node marked by both left and right splits. 




(Co-)component    $\mu$
$C_2$             $C'_6$
$C_1$             $C'_1$
$C'_1$            $C_1$
$C'_2$            $C_1$
$C'_3$            $C_1$
$C'_4$            $C_1$
$C'_5$            $C_1$
$C'_6$            $C_2$

Table 2: The $\mu$-values. 